Conversation

@slinder1
Contributor

Introduce new SPILL pseudos to allow CFI to be generated for only CSR spills, and to make the unwind information accurate at the ISA-instruction level.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Co-authored-by: Scott Linder [email protected]
Co-authored-by: Venkata Ramanaiah Nalamothu [email protected]
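As an illustration of where the extra unwind-table size comes from (a sketch, not code from this patch; the helper name and signature are assumptions): emitting a .cfi_offset directly after each individual spill instruction means the encoded FDE needs a DW_CFA_advance_location before each rule, instead of one batch of rules at a single point in the prologue.

// Sketch: build one CFI offset rule immediately after a CSR spill.
static MachineInstr *emitSpillOffsetCFI(MachineBasicBlock &MBB,
                                        MachineBasicBlock::iterator MBBI,
                                        const DebugLoc &DL,
                                        const TargetInstrInfo &TII,
                                        MCRegister Reg, int64_t Offset) {
  MachineFunction &MF = *MBB.getParent();
  const MCRegisterInfo &MCRI = *MF.getContext().getRegisterInfo();
  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
      nullptr, MCRI.getDwarfRegNum(Reg, /*isEH=*/false), Offset));
  // Each such CFI_INSTRUCTION at a distinct code offset costs an extra
  // DW_CFA_advance_location when the FDE is encoded.
  return BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
      .addCFIIndex(CFIIndex)
      .setMIFlag(MachineInstr::FrameSetup);
}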

epilk and others added 5 commits October 22, 2025 21:27
AMDGPU requires more complex CFI rules; normally these would be expressed with .cfi_escape, but that would make the CFI unreadable and would make it difficult to update registers in CFI instructions (also something AMDGPU requires).

While these can be represented with .cfi_escape, using these pseudo-CFI instructions makes .s/.mir files more readable, and it is necessary to support updating registers in CFI instructions (something that the AMDGPU backend requires).
Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Co-authored-by: Scott Linder <[email protected]>
Co-authored-by: Venkata Ramanaiah Nalamothu <[email protected]>
This does not implement CSR spills other than those AMDGPU handles
during PEI. The remaining spills are handled in a subsequent patch.

Co-authored-by: Scott Linder <[email protected]>
Co-authored-by: Venkata Ramanaiah Nalamothu <[email protected]>
Introduce new SPILL pseudos to allow CFI to be generated for only CSR spills, and to make the unwind information accurate at the ISA-instruction level.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Co-authored-by: Scott Linder <[email protected]>
Co-authored-by: Venkata Ramanaiah Nalamothu <[email protected]>
Contributor Author

slinder1 commented Oct 22, 2025

@llvmbot
Member

llvmbot commented Oct 22, 2025

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-globalisel

Author: Scott Linder (slinder1)

Changes

Introduce new SPILL pseudos to allow CFI to be generated for only CSR spills, and to make the unwind information accurate at the ISA-instruction level.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Co-authored-by: Scott Linder <[email protected]>
Co-authored-by: Venkata Ramanaiah Nalamothu <[email protected]>


Patch is 4.01 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164724.diff

101 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIFrameLowering.cpp (+93-6)
  • (modified) llvm/lib/Target/AMDGPU/SIFrameLowering.h (+22)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+92-45)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+18-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+18)
  • (modified) llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp (+36-54)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+203-20)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (+11-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/assert-align.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/localizer.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/a-v-global-atomicrmw.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/abi-attribute-hints-undefined-behavior.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir (+2688)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll (+4-3)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn-call-whole-wave.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll (+5621-3349)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.320bit.ll (+26-26)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll (+1130-1130)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.576bit.ll (+114-114)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.640bit.ll (+609-33)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.704bit.ll (+649-57)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll (+736-128)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll (+878-262)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll (+1054-438)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll (+1229-613)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-preserve-cc.ll (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-pow-codegen.ll (+154-154)
  • (modified) llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit-undefined-behavior2.ll (+19-18)
  • (modified) llvm/test/CodeGen/AMDGPU/av_spill_cross_bb_usage.mir (+12)
  • (modified) llvm/test/CodeGen/AMDGPU/bf16.ll (+214-208)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/call-args-inreg.ll (+144-144)
  • (modified) llvm/test/CodeGen/AMDGPU/call-argument-types.ll (+68-68)
  • (modified) llvm/test/CodeGen/AMDGPU/call-graph-register-usage.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll (+60-60)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll (+96-96)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs-packed.ll (+23-23)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll (+23-23)
  • (modified) llvm/test/CodeGen/AMDGPU/cross-block-use-is-not-abi-copy.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/debug-frame.ll (+491-15)
  • (modified) llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll (+49-33)
  • (modified) llvm/test/CodeGen/AMDGPU/dynamic-vgpr-reserve-stack-for-cwsr.ll (+8-10)
  • (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir (+96)
  • (modified) llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/frame-index.mir (+96)
  • (modified) llvm/test/CodeGen/AMDGPU/frame-setup-without-sgpr-to-vgpr-spills.ll (+7-22)
  • (modified) llvm/test/CodeGen/AMDGPU/function-args-inreg.ll (+15-14)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-call-non-gfx-func.ll (+74-74)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll (+2269-2280)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll (+118-121)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll (+139-83)
  • (modified) llvm/test/CodeGen/AMDGPU/global-alias.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll (+46-46)
  • (modified) llvm/test/CodeGen/AMDGPU/indirect-call.ll (+552-552)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-delay-alu-bug.ll (+3-4)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-waitcnts-crash.ll (+9-8)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readfirstlane.ll (+16-16)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.maximum.f64.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.minimum.f64.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.gfx10.ll (+51-49)
  • (modified) llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.ll (+826-825)
  • (modified) llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/need-fp-from-vgpr-spills.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/nested-calls.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/no-source-locations-in-prologue.ll (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/pei-amdgpu-cs-chain-preserve.mir (+1)
  • (modified) llvm/test/CodeGen/AMDGPU/pei-vgpr-block-spill-csr.mir (+320)
  • (modified) llvm/test/CodeGen/AMDGPU/preserve-wwm-copy-dst-reg.ll (+9-24)
  • (modified) llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll (+6-7)
  • (modified) llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir (+162-95)
  • (modified) llvm/test/CodeGen/AMDGPU/sgpr-spills-split-regalloc.ll (+6-21)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i64.v8i64.ll (+237-205)
  • (modified) llvm/test/CodeGen/AMDGPU/si-lower-sgpr-spills-vgpr-lanes-usage.mir (+12-9)
  • (modified) llvm/test/CodeGen/AMDGPU/si-lower-sgpr-spills.mir (+5)
  • (modified) llvm/test/CodeGen/AMDGPU/sibling-call.ll (+123-123)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-partial-csr-sgpr-live-ins.mir (+5)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-sgpr-csr-live-ins.mir (+1)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-sgpr-to-virtual-vgpr.mir (+16)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/stack-realign.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/stacksave_stackrestore.ll (+11-11)
  • (modified) llvm/test/CodeGen/AMDGPU/strictfp_f16_abi_promote.ll (+27-27)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev504645-global-fold.ll (+4-3)
  • (modified) llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.error.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/tuple-allocation-failure.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/unfold-masked-merge-scalar-variablemask.ll (+18-20)
  • (modified) llvm/test/CodeGen/AMDGPU/unspill-vgpr-after-rewrite-vgpr-mfma.ll (+10-8)
  • (modified) llvm/test/CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll (+84-84)
  • (modified) llvm/test/CodeGen/AMDGPU/vgpr-tuple-allocation.ll (+44-37)
  • (modified) llvm/test/CodeGen/AMDGPU/wave32.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/whole-wave-functions.ll (+50-54)
  • (modified) llvm/test/CodeGen/AMDGPU/whole-wave-register-copy.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/whole-wave-register-spill.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/wwm-reserved-spill.ll (+4-4)
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
index 5a0b1afbdfdff..bbde3c49f64c6 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
@@ -2244,17 +2244,49 @@ bool SIFrameLowering::allocateScavengingFrameIndexesNearIncomingSP(
   return true;
 }
 
+static bool isLiveIntoMBB(MCRegister Reg, MachineBasicBlock &MBB,
+                          const TargetRegisterInfo *TRI) {
+  for (MCRegAliasIterator R(Reg, TRI, true); R.isValid(); ++R) {
+    if (MBB.isLiveIn(*R)) {
+      return true;
+    }
+  }
+  return false;
+}
+
 bool SIFrameLowering::spillCalleeSavedRegisters(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
     ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
   MachineFunction *MF = MBB.getParent();
   const GCNSubtarget &ST = MF->getSubtarget<GCNSubtarget>();
-  if (!ST.useVGPRBlockOpsForCSR())
-    return false;
+  const SIInstrInfo *TII = ST.getInstrInfo();
+  const SIRegisterInfo *SITRI = static_cast<const SIRegisterInfo *>(TRI);
+
+  if (!ST.useVGPRBlockOpsForCSR()) {
+    for (const CalleeSavedInfo &CS : CSI) {
+      // Insert the spill to the stack frame.
+      unsigned Reg = CS.getReg();
+
+      if (CS.isSpilledToReg()) {
+        BuildMI(MBB, MI, DebugLoc(), TII->get(TargetOpcode::COPY),
+                CS.getDstReg())
+            .addReg(Reg, getKillRegState(true));
+      } else {
+        const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(
+            Reg, Reg == SITRI->getReturnAddressReg(*MF) ? MVT::i64 : MVT::i32);
+        // If this value was already livein, we probably have a direct use of
+        // the incoming register value, so don't kill at the spill point. This
+        // happens since we pass some special inputs (workgroup IDs) in the
+        // callee saved range.
+        const bool IsLiveIn = isLiveIntoMBB(Reg, MBB, TRI);
+        TII->storeRegToStackSlotCFI(MBB, MI, Reg, !IsLiveIn, CS.getFrameIdx(),
+                                    RC, TRI);
+      }
+    }
+    return true;
+  }
 
   MachineFrameInfo &FrameInfo = MF->getFrameInfo();
-  SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
-  const SIInstrInfo *TII = ST.getInstrInfo();
   SIMachineFunctionInfo *FuncInfo = MF->getInfo<SIMachineFunctionInfo>();
 
   const TargetRegisterClass *BlockRegClass =
@@ -2278,10 +2310,10 @@ bool SIFrameLowering::spillCalleeSavedRegisters(
                                  FrameInfo.getObjectAlign(FrameIndex));
 
     BuildMI(MBB, MI, MI->getDebugLoc(),
-            TII->get(AMDGPU::SI_BLOCK_SPILL_V1024_SAVE))
+            TII->get(AMDGPU::SI_BLOCK_SPILL_V1024_CFI_SAVE))
         .addReg(Reg, getKillRegState(false))
         .addFrameIndex(FrameIndex)
-        .addReg(MFI->getStackPtrOffsetReg())
+        .addReg(FuncInfo->getStackPtrOffsetReg())
         .addImm(0)
         .addImm(Mask)
         .addMemOperand(MMO);
@@ -2467,6 +2499,22 @@ MachineInstr *SIFrameLowering::buildCFI(MachineBasicBlock &MBB,
       .setMIFlag(flag);
 }
 
+MachineInstr *SIFrameLowering::buildCFIForVRegToVRegSpill(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+    const DebugLoc &DL, const Register Reg, const Register RegCopy) const {
+  MachineFunction &MF = *MBB.getParent();
+  const MCRegisterInfo &MCRI = *MF.getContext().getRegisterInfo();
+  const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+
+  unsigned MaskReg = MCRI.getDwarfRegNum(
+      ST.isWave32() ? AMDGPU::EXEC_LO : AMDGPU::EXEC, false);
+  auto CFIInst = MCCFIInstruction::createLLVMVectorRegisterMask(
+      nullptr, MCRI.getDwarfRegNum(Reg, false),
+      MCRI.getDwarfRegNum(RegCopy, false), VGPRLaneBitSize, MaskReg,
+      ST.getWavefrontSize());
+  return buildCFI(MBB, MBBI, DL, std::move(CFIInst));
+}
+
 MachineInstr *SIFrameLowering::buildCFIForSGPRToVGPRSpill(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
     const DebugLoc &DL, const Register SGPR, const Register VGPR,
@@ -2515,6 +2563,34 @@ MachineInstr *SIFrameLowering::buildCFIForSGPRToVGPRSpill(
   return buildCFI(MBB, MBBI, DL, std::move(CFIInst));
 }
 
+MachineInstr *SIFrameLowering::buildCFIForSGPRToVMEMSpill(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+    const DebugLoc &DL, unsigned SGPR, int64_t Offset) const {
+  MachineFunction &MF = *MBB.getParent();
+  const MCRegisterInfo &MCRI = *MF.getContext().getRegisterInfo();
+  return buildCFI(MBB, MBBI, DL,
+                  llvm::MCCFIInstruction::createOffset(
+                      nullptr, MCRI.getDwarfRegNum(SGPR, false), Offset));
+}
+
+MachineInstr *SIFrameLowering::buildCFIForVGPRToVMEMSpill(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+    const DebugLoc &DL, unsigned VGPR, int64_t Offset) const {
+  const MachineFunction &MF = *MBB.getParent();
+  const MCRegisterInfo &MCRI = *MF.getContext().getRegisterInfo();
+  const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+
+  int DwarfVGPR = MCRI.getDwarfRegNum(VGPR, false);
+  assert(DwarfVGPR != -1);
+
+  unsigned MaskReg = MCRI.getDwarfRegNum(
+      ST.isWave32() ? AMDGPU::EXEC_LO : AMDGPU::EXEC, false);
+  auto CFIInst = MCCFIInstruction::createLLVMVectorOffset(
+      nullptr, DwarfVGPR, VGPRLaneBitSize, MaskReg, ST.getWavefrontSize(),
+      Offset);
+  return buildCFI(MBB, MBBI, DL, std::move(CFIInst));
+}
+
 MachineInstr *SIFrameLowering::buildCFIForRegToSGPRPairSpill(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
     const DebugLoc &DL, const Register Reg, const Register SGPRPair) const {
@@ -2535,3 +2611,14 @@ MachineInstr *SIFrameLowering::buildCFIForRegToSGPRPairSpill(
       nullptr, DwarfReg, DwarfSGPR0, SGPRBitSize, DwarfSGPR1, SGPRBitSize);
   return buildCFI(MBB, MBBI, DL, std::move(CFIInst));
 }
+
+MachineInstr *
+SIFrameLowering::buildCFIForSameValue(MachineBasicBlock &MBB,
+                                      MachineBasicBlock::iterator MBBI,
+                                      const DebugLoc &DL, Register Reg) const {
+  const MachineFunction &MF = *MBB.getParent();
+  const MCRegisterInfo &MCRI = *MF.getContext().getRegisterInfo();
+  int DwarfReg = MCRI.getDwarfRegNum(Reg, /*isEH=*/false);
+  auto CFIInst = MCCFIInstruction::createSameValue(nullptr, DwarfReg);
+  return buildCFI(MBB, MBBI, DL, std::move(CFIInst));
+}
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.h b/llvm/lib/Target/AMDGPU/SIFrameLowering.h
index 20f608f2dfc24..2b716db0b7a22 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.h
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.h
@@ -120,6 +120,13 @@ class SIFrameLowering final : public AMDGPUFrameLowering {
            const DebugLoc &DL, const MCCFIInstruction &CFIInst,
            MachineInstr::MIFlag flag = MachineInstr::FrameSetup) const;
 
+  /// Create a CFI index describing a spill of the VGPR/AGPR \p Reg to another
+  /// VGPR/AGPR \p RegCopy and build a MachineInstr around it.
+  MachineInstr *buildCFIForVRegToVRegSpill(MachineBasicBlock &MBB,
+                                           MachineBasicBlock::iterator MBBI,
+                                           const DebugLoc &DL,
+                                           const Register Reg,
+                                           const Register RegCopy) const;
   /// Create a CFI index describing a spill of an SGPR to a single lane of
   /// a VGPR and build a MachineInstr around it.
   MachineInstr *buildCFIForSGPRToVGPRSpill(MachineBasicBlock &MBB,
@@ -134,10 +141,25 @@ class SIFrameLowering final : public AMDGPUFrameLowering {
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
       const DebugLoc &DL, Register SGPR,
       ArrayRef<SIRegisterInfo::SpilledReg> VGPRSpills) const;
+  /// Create a CFI index describing a spill of a SGPR to VMEM and
+  /// build a MachineInstr around it.
+  MachineInstr *buildCFIForSGPRToVMEMSpill(MachineBasicBlock &MBB,
+                                           MachineBasicBlock::iterator MBBI,
+                                           const DebugLoc &DL, unsigned SGPR,
+                                           int64_t Offset) const;
+  /// Create a CFI index describing a spill of a VGPR to VMEM and
+  /// build a MachineInstr around it.
+  MachineInstr *buildCFIForVGPRToVMEMSpill(MachineBasicBlock &MBB,
+                                           MachineBasicBlock::iterator MBBI,
+                                           const DebugLoc &DL, unsigned VGPR,
+                                           int64_t Offset) const;
   MachineInstr *buildCFIForRegToSGPRPairSpill(MachineBasicBlock &MBB,
                                               MachineBasicBlock::iterator MBBI,
                                               const DebugLoc &DL, Register Reg,
                                               Register SGPRPair) const;
+  MachineInstr *buildCFIForSameValue(MachineBasicBlock &MBB,
+                                     MachineBasicBlock::iterator MBBI,
+                                     const DebugLoc &DL, Register Reg) const;
   // Returns true if the function may need to reserve space on the stack for the
   // CWSR trap handler.
   bool mayReserveScratchForCWSR(const MachineFunction &MF) const;
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index d930a21c2d7f5..a097e721d142f 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -1530,22 +1530,26 @@ SIInstrInfo::getIndirectRegWriteMovRelPseudo(unsigned VecSize, unsigned EltSize,
   return get(getIndirectVGPRWriteMovRelPseudoOpc(VecSize));
 }
 
-static unsigned getSGPRSpillSaveOpcode(unsigned Size) {
+static unsigned getSGPRSpillSaveOpcode(unsigned Size, bool NeedsCFI) {
   switch (Size) {
   case 4:
-    return AMDGPU::SI_SPILL_S32_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S32_CFI_SAVE : AMDGPU::SI_SPILL_S32_SAVE;
   case 8:
-    return AMDGPU::SI_SPILL_S64_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S64_CFI_SAVE : AMDGPU::SI_SPILL_S64_SAVE;
   case 12:
-    return AMDGPU::SI_SPILL_S96_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S96_CFI_SAVE : AMDGPU::SI_SPILL_S96_SAVE;
   case 16:
-    return AMDGPU::SI_SPILL_S128_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S128_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S128_SAVE;
   case 20:
-    return AMDGPU::SI_SPILL_S160_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S160_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S160_SAVE;
   case 24:
-    return AMDGPU::SI_SPILL_S192_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S192_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S192_SAVE;
   case 28:
-    return AMDGPU::SI_SPILL_S224_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S224_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S224_SAVE;
   case 32:
     return AMDGPU::SI_SPILL_S256_SAVE;
   case 36:
@@ -1557,69 +1561,90 @@ static unsigned getSGPRSpillSaveOpcode(unsigned Size) {
   case 48:
     return AMDGPU::SI_SPILL_S384_SAVE;
   case 64:
-    return AMDGPU::SI_SPILL_S512_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S512_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S512_SAVE;
   case 128:
-    return AMDGPU::SI_SPILL_S1024_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_S1024_CFI_SAVE
+                    : AMDGPU::SI_SPILL_S1024_SAVE;
   default:
     llvm_unreachable("unknown register size");
   }
 }
 
-static unsigned getVGPRSpillSaveOpcode(unsigned Size) {
+static unsigned getVGPRSpillSaveOpcode(unsigned Size, bool NeedsCFI) {
   switch (Size) {
   case 2:
     return AMDGPU::SI_SPILL_V16_SAVE;
   case 4:
-    return AMDGPU::SI_SPILL_V32_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V32_CFI_SAVE : AMDGPU::SI_SPILL_V32_SAVE;
   case 8:
-    return AMDGPU::SI_SPILL_V64_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V64_CFI_SAVE : AMDGPU::SI_SPILL_V64_SAVE;
   case 12:
-    return AMDGPU::SI_SPILL_V96_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V96_CFI_SAVE : AMDGPU::SI_SPILL_V96_SAVE;
   case 16:
-    return AMDGPU::SI_SPILL_V128_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V128_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V128_SAVE;
   case 20:
-    return AMDGPU::SI_SPILL_V160_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V160_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V160_SAVE;
   case 24:
-    return AMDGPU::SI_SPILL_V192_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V192_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V192_SAVE;
   case 28:
-    return AMDGPU::SI_SPILL_V224_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V224_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V224_SAVE;
   case 32:
-    return AMDGPU::SI_SPILL_V256_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V256_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V256_SAVE;
   case 36:
-    return AMDGPU::SI_SPILL_V288_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V288_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V288_SAVE;
   case 40:
-    return AMDGPU::SI_SPILL_V320_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V320_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V320_SAVE;
   case 44:
-    return AMDGPU::SI_SPILL_V352_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V352_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V352_SAVE;
   case 48:
-    return AMDGPU::SI_SPILL_V384_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V384_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V384_SAVE;
   case 64:
-    return AMDGPU::SI_SPILL_V512_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V512_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V512_SAVE;
   case 128:
-    return AMDGPU::SI_SPILL_V1024_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_V1024_CFI_SAVE
+                    : AMDGPU::SI_SPILL_V1024_SAVE;
   default:
     llvm_unreachable("unknown register size");
   }
 }
 
-static unsigned getAVSpillSaveOpcode(unsigned Size) {
+static unsigned getAVSpillSaveOpcode(unsigned Size, bool NeedsCFI) {
   switch (Size) {
   case 4:
-    return AMDGPU::SI_SPILL_AV32_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV32_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV32_SAVE;
   case 8:
-    return AMDGPU::SI_SPILL_AV64_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV64_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV64_SAVE;
   case 12:
-    return AMDGPU::SI_SPILL_AV96_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV96_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV96_SAVE;
   case 16:
-    return AMDGPU::SI_SPILL_AV128_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV128_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV128_SAVE;
   case 20:
-    return AMDGPU::SI_SPILL_AV160_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV160_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV160_SAVE;
   case 24:
-    return AMDGPU::SI_SPILL_AV192_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV192_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV192_SAVE;
   case 28:
-    return AMDGPU::SI_SPILL_AV224_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV224_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV224_SAVE;
   case 32:
-    return AMDGPU::SI_SPILL_AV256_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV256_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV256_SAVE;
   case 36:
     return AMDGPU::SI_SPILL_AV288_SAVE;
   case 40:
@@ -1629,9 +1654,11 @@ static unsigned getAVSpillSaveOpcode(unsigned Size) {
   case 48:
     return AMDGPU::SI_SPILL_AV384_SAVE;
   case 64:
-    return AMDGPU::SI_SPILL_AV512_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV512_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV512_SAVE;
   case 128:
-    return AMDGPU::SI_SPILL_AV1024_SAVE;
+    return NeedsCFI ? AMDGPU::SI_SPILL_AV1024_CFI_SAVE
+                    : AMDGPU::SI_SPILL_AV1024_SAVE;
   default:
     llvm_unreachable("unknown register size");
   }
@@ -1651,7 +1678,7 @@ static unsigned getWWMRegSpillSaveOpcode(unsigned Size,
 
 unsigned SIInstrInfo::getVectorRegSpillSaveOpcode(
     Register Reg, const TargetRegisterClass *RC, unsigned Size,
-    const SIMachineFunctionInfo &MFI) const {
+    const SIMachineFunctionInfo &MFI, bool NeedsCFI) const {
   bool IsVectorSuperClass = RI.isVectorSuperClass(RC);
 
   // Choose the right opcode if spilling a WWM register.
@@ -1660,16 +1687,16 @@ unsigned SIInstrInfo::getVectorRegSpillSaveOpcode(
 
   // TODO: Check if AGPRs are available
   if (ST.hasMAIInsts())
-    return getAVSpillSaveOpcode(Size);
+    return getAVSpillSaveOpcode(Size, NeedsCFI);
 
-  return getVGPRSpillSaveOpcode(Size);
+  return getVGPRSpillSaveOpcode(Size, NeedsCFI);
 }
 
-void SIInstrInfo::storeRegToStackSlot(
+void SIInstrInfo::storeRegToStackSlotImpl(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register SrcReg,
     bool isKill, int FrameIndex, const TargetRegisterClass *RC,
-    const TargetRegisterInfo *TRI, Register VReg,
-    MachineInstr::MIFlag Flags) const {
+    const TargetRegisterInfo *TRI, Register VReg, MachineInstr::MIFlag Flags,
+    bool NeedsCFI) const {
   MachineFunction *MF = MBB.getParent();
   SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
   MachineFrameInfo &FrameInfo = MF->getFrameInfo();
@@ -1691,7 +1718,8 @@ void SIInstrInfo::storeRegToStackSlot(
 
     // We are only allowed to create one new instruction when spilling
     // registers, so we need to use pseudo instruction for spilling SGPRs.
-    const MCInstrDesc &OpDesc = get(getSGPRSpillSaveOpcode(SpillSize));
+    const MCInstrDesc &OpDesc =
+        get(getSGPRSpillSaveOpcode(SpillSize, NeedsCFI));
 
     // The SGPR spill/restore instructions only work on number sgprs, so we need
     // to make sure we are using the correct register class.
@@ -1710,8 +1738,8 @@ void SIInstrInfo::storeRegToStackSlot(
     return;
   }
 
-  unsigned Opcode =
-      getVectorRegSpillSaveOpcode(VReg ? VReg : SrcReg, RC, SpillSize, *MFI);
+  unsigned Opcode = getVectorRegSpillSaveOpcode(VReg ? VReg : SrcReg, RC,
+                                                SpillSize, *MFI, NeedsCFI);
   MFI->setHasSpilledVGPRs();
 
   BuildMI(MBB, MI, DL, get(Opcode))
@@ -1722,6 +1750,25 @@ void SIInstrInfo::storeRegToStackSlot(
     .addMemOperand(MMO);
 }
 
+void SIInstrInfo::storeRegToStackSlot(
+    MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register SrcReg,
+    bool isKill, int FrameIndex, const TargetRegisterClass *RC,
+    const TargetRegisterInfo *TRI, Register VReg,
+    MachineInstr::MIFlag Flags) const {
+  storeRegToStackSlotImpl(MBB, MI, SrcReg, isKill, FrameIndex, RC, TRI, VReg,
+                          Flags, false);
+}
+
+void SIInstrInfo::storeRegToStackSlotCFI(MachineBasicBlock &MBB,
+                                         MachineBasicBlock::iterator MI,
+                                         Register SrcReg, bool isKill,
+                                         int FrameIndex,
+                                         const TargetRegisterClass *RC,
+                                         const TargetRegisterInfo *TRI) const {
+  storeRegToStackSlotImpl(MBB, MI, SrcReg, isKill, FrameIndex, RC, TRI,
+                          Register(), MachineInstr::NoFlags, true);
+}
+
 static unsigned getSGPRSpillRestoreOpcode(unsigned Size) {
   switch (Size) {
   case 4:
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 5fdeddaf3f736..9c0a80bbcecda 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -293,13 +293,29 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
                     MachineBasicBlock::iterator I, const DebugLoc &DL,
                     Register SrcReg, int Value)  const;
 
+private:
+  void storeRegToStackSlotImpl(MachineBasicBlock &MBB,
+                               MachineBasicBlock::iterator MI, Register SrcReg,
+                               bool isKill, int FrameIndex,
+                               const TargetRegisterClass *RC,
+                               const TargetRegisterInfo *TRI, Register VReg,
+                               MachineInstr::MIFlag Flags, bool NeedsCFI) const;
+
+public:
+  void storeRegToStackSlotCFI(MachineBasicBlock &MBB,
+                              MachineBasicBlock::iterator MI, Register SrcReg,
+                              bool isKill, int FrameIndex,
+                              const TargetRegisterClass *RC,
+                              const TargetRegisterInfo *TRI) const;
+
   bool getConstValDefinedInReg(const MachineInstr &MI, cons...
[truncated]

return true;
}

static bool isLiveIntoMBB(MCRegister Reg, MachineBasicBlock &MBB,
Contributor

This shouldn't be here. I'm still unclear on what the rules are for when register aliases will appear in the live-in list. This also should account for the LaneMask.

Contributor Author

Which LaneMask should it account for? Isn't a live-in conservatively assumed to be LaneBitmask::getAll(), so any use would mean we can't mark the spill operand as killed?

I am a bit out of my depth, I don't have a great grasp on subreg liveness tracking, so any help understanding is appreciated

Contributor

For the prolog/epilog case, the lanemask probably doesn't matter?

I'm more worried about the usage of MCRegAliasIterator. I would hope you wouldn't need to invent this helper here. For cases like a livein list including $vgpr0_vgpr1, I'm not sure if it is supposed to also include $vgpr0 and $vgpr1 as separate entries. MCRegAliasIterator is actually really expensive; there are a lot of overlapping registers.
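For reference, a minimal sketch of what a lane-mask-aware live-in query could look like, assuming MBB, Reg, and TRI are in scope (this is illustrative, not code from the patch): each live-in entry pairs a physical register with a LaneBitmask, so the lanes can be consulted directly instead of expanding every alias.

// Sketch only: walk the explicit live-in list rather than MCRegAliasIterator.
// Each entry pairs a physical register (e.g. $vgpr0_vgpr1) with the lanes of
// it that are live into the block.
for (const MachineBasicBlock::RegisterMaskPair &LI : MBB.liveins()) {
  MCRegister PhysReg = LI.PhysReg;
  LaneBitmask Lanes = LI.LaneMask;
  if (Lanes.any() && TRI->regsOverlap(PhysReg, Reg)) {
    // Some live-in lanes alias Reg, so the spill should not be marked kill.
  }
}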

Contributor Author

Not sure where the functionality fits best, maybe MachineRegisterInfo, or TargetLowering?

It seems like so far this is the best approach we have? Same issue as in #129848

Maybe we bite the bullet and use MCRegAliasIterator for now and work out a more efficient representation?

Other targets also implement spillCalleeSavedRegisters in a similar way; of course, their MCRegAliasIterator is presumably much cheaper.

For example X86:

// Update LiveIn of the basic block and decide whether we can add a kill flag
// to the use.                                                               
auto UpdateLiveInCheckCanKill = [&](Register Reg) {                          
  const MachineRegisterInfo &MRI = MF.getRegInfo();                          
  // Do not set a kill flag on values that are also marked as live-in. This  
  // happens with the @llvm-returnaddress intrinsic and with arguments       
  // passed in callee saved registers.                                       
  // Omitting the kill flags is conservatively correct even if the live-in   
  // is not used after all.                                                  
  if (MRI.isLiveIn(Reg))                                                     
    return false;                                                            
  MBB.addLiveIn(Reg);                                                        
  // Check if any subregister is live-in                                     
  for (MCRegAliasIterator AReg(Reg, TRI, false); AReg.isValid(); ++AReg)     
    if (MRI.isLiveIn(*AReg))                                                 
      return false;                                                          
  return true;                                                               
};                                                                           
auto UpdateLiveInGetKillRegState = [&](Register Reg) {                       
  return getKillRegState(UpdateLiveInCheckCanKill(Reg));                     
};                                                                           

Others don't consider aliases, like ARM:

const MachineRegisterInfo &MRI = MF.getRegInfo();                        
bool isLiveIn = MRI.isLiveIn(Reg);                                       
if (!isLiveIn && !MRI.isReserved(Reg))
  MBB.addLiveIn(Reg);
// If NoGap is true, push consecutive registers and then leave the rest  
// for other instructions. e.g.                                          
// vpush {d8, d10, d11} -> vpush {d8}, vpush {d10, d11}                  
if (NoGap && LastReg && LastReg != Reg-1)                                
  break;                                                                 
LastReg = Reg;                                                           
// Do not set a kill flag on values that are also marked as live-in. This
// happens with the @llvm-returnaddress intrinsic and with arguments     
// passed in callee saved registers.                                     
// Omitting the kill flags is conservatively correct even if the live-in 
// is not used after all.                                                
Regs.push_back(std::make_pair(Reg, /*isKill=*/!isLiveIn));               

and AArch64:

static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {           
  // Do not set a kill flag on values that are also marked as live-in. This     
  // happens with the @llvm-returnaddress intrinsic and with arguments passed in
  // callee saved registers.                                                    
  // Omitting the kill flags is conservatively correct even if the live-in      
  // is not used after all.                                                     
  bool IsLiveIn = MF.getRegInfo().isLiveIn(Reg);                                
  return getKillRegState(!IsLiveIn);
}                                                                               

Contributor Author

slinder1 Nov 19, 2025

For cases like a livein list including $vgpr0_vgpr1, I'm not sure if it is supposed to also include $vgpr0 and $vgpr1 as separate entries.

Thinking on this point in particular, I don't know of a more efficient way to "expand" the livein list to include all such aliases. It seems like we either pay for it up-front and expand the livein list, or we pay for it at the point we look at the CSRs.

If we expect the livein list to generally be smaller (and potentially have useful LaneBitmasks we can filter units by, even in the prologue?) then we could generate a SparseBitVector of all the physical registers which the livein list implies should not be killed if they are spilled.

I'll try that, but I'm unsure if the result will actually be cheaper. In particular it seems like we have a cache for MCRegAliasIterator and this would bypass that. The only benefit would be that we could consider the LaneMask on the liveins themselves, and I'm not sure how useful that is in thinning out the set of aliases.
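A rough sketch of that idea, with assumed helper names (this is not code from the patch): expand the live-in list into a set of register units once per block, then test each CSR against that set instead of walking MCRegAliasIterator per spill.

// Hypothetical sketch: pay the live-in expansion cost once, up front.
// A refinement could filter units by each entry's LaneMask; omitted here.
static BitVector computeNoKillUnits(const MachineBasicBlock &MBB,
                                    const TargetRegisterInfo *TRI) {
  BitVector NoKillUnits(TRI->getNumRegUnits());
  for (const MachineBasicBlock::RegisterMaskPair &LI : MBB.liveins())
    for (MCRegUnit Unit : TRI->regunits(LI.PhysReg))
      NoKillUnits.set(Unit);
  return NoKillUnits;
}

// A CSR spill of Reg could then be marked kill only if none of its units
// appear in the set.
static bool canKillOnSpill(MCRegister Reg, const BitVector &NoKillUnits,
                           const TargetRegisterInfo *TRI) {
  for (MCRegUnit Unit : TRI->regunits(Reg))
    if (NoKillUnits.test(Unit))
      return false;
  return true;
}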

Contributor Author

Maybe we can simplify the problem somewhat by only enumerating root registers, assuming the CSRs themselves are roots? Or we could ensure the actual LiveIns list contains the redundant roots before PrologEpilog. I will see if this is a viable approach, since the alternatives I've looked at so far seem like they are essentially no better than using MCRegAliasIterator.

; GCN-NEXT: buffer_store_dword v40, off, s[0:3], s33 ; 4-byte Folded Spill
; GCN-NEXT: s_mov_b64 exec, s[6:7]
; GCN-NEXT: v_writelane_b32 v40, s4, 2
; GCN-NEXT: v_writelane_b32 v40, s30, 0
Contributor

I'm not sure why this results in all of these scheduling changes

Contributor Author

There are a few things leading to the scheduling changes, and I will split this change up into multiple patches to tease them apart so we can consider them independently.

The biggest issue is that the postmisched which runs after prologepilog (conservatively) treats CFI_INSTRUCTIONs as scheduling boundaries, so e.g. sinking v_readlane_b32 s30, v40, 0 from its original spot in the prolog past the CFI_INSTRUCTION which describes it is not valid.
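For context, this comes from the generic scheduling-boundary check: CFI_INSTRUCTION is a "position" instruction, and the default TargetInstrInfo hook refuses to schedule across positions. Roughly (a paraphrase, not AMDGPU-specific code):

// Paraphrase of the default TargetInstrInfo::isSchedulingBoundary behavior:
// terminators, labels, and CFI_INSTRUCTIONs (MI.isPosition() covers the
// latter two) act as boundaries, so post-RA scheduling cannot sink or hoist
// instructions past an interleaved CFI_INSTRUCTION.
static bool isBoundaryLike(const MachineInstr &MI) {
  return MI.isTerminator() || MI.isPosition();
}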

This is why we force the CFI emission in all cases for AMDGPU, because we really want to avoid spurious scheduling changes which only occur when debug-info is enabled. If enabling debug-info can actually change program behavior then it isn't terribly useful. I do think we should move to just making the default be CFI-enabled, and still respect nounwind for e.g. test cases.

I dug up https://bugs.llvm.org/show_bug.cgi?id=37240 and https://reviews.llvm.org/D71350 concerning the larger problem, and AFAIK there isn't a great general solution yet. X86 seems to dodge the issue by not enabling a post-RA scheduler by default. ARM/AArch64 just seem to have the bug currently.

I spent a couple days trying to sketch out what a better fix might look like, but it will be a large effort, and I would rather get the changes we have already been relying on for unwind info upstream first, if that is acceptable.

slinder1 force-pushed the users/slinder1/amdgpu-cfi-4 branch from ee97ce9 to e811d05 on November 14, 2025 20:06